A discrete model for the evaluation of public policies: The case of Colombia during the COVID-19 pandemic

In mathematical epidemiology, it is usual to implement compartmental models to study the transmission of diseases, allowing comprehension of the outbreak dynamics. Thus, it is necessary to identify the natural history of the disease and to establish promissory relations between the structure of a mathematical model, as well as its parameters, with control-related strategies (real interventions) and relevant socio-cultural behaviors. However, we identified gaps between the model creation and its implementation for the use of decision-makers for policy design. We aim to cover these gaps by proposing a discrete mathematical model with parameters having intuitive meaning to be implemented to help decision-makers in control policy design. The model considers novel contagion probabilities, quarantine, and diffusion processes to represent the recovery and mortality dynamics. We applied mathematical model for COVID-19 to Colombia and some of its localities; moreover, the model structure could be adapted for other diseases. Subsequently, we implemented it on a web platform (MathCOVID) for the usage of decision-makers to simulate the effect of policies such as lock-downs, social distancing, identification in the contagion network, and connectivity among populations. Furthermore, it was possible to assess the effects of migration and vaccination strategies as time-dependent inputs. Finally, the platform was capable of simulating the effects of applying one or more policies simultaneously.


Introduction
Mathematical models are versatile tools for the study of epidemiological events. Such models can describe the transmission and control of pandemical infectious diseases, such as influenza and SARS [1,2]. In recent years, researchers have stressed the important of modeling diseases considering their natural history or the social dynamics of the population, e.g., the spatial heterogeneity of the population, the incubation and latency periods, the sources of contagion, and the recovery process [3,4]. Thus, the use of mathematical models is widespread, especially in designing and evaluating control policies [5,6]. some individuals are presymptomatic and might be contagious before developing symptoms [25]. Additionally, some of them could be asymptomatic and still transmit the disease. The detection of the asymptomatic population relies on two key strategies: (i) performing intensive COVID-19 tests over the population and (ii) identifying individuals who came into close contact with an infected person (contagion network) [26]). Despite the high saturation that COVID-19 has generated in health systems worldwide, the affected localities carry out reliable daily records based on clinical data to infer the disease trend through clinical data. Generally, such data comprises the positive, the death, and the recovered individuals.

Model assumptions and mathematical structure
We searched in the literature for a model that could serve as a basic structure. We chose the one proposed by [27] because it comprises the SEIR flow (susceptible, exposed, infected, and recovered) besides an environmental reservoir T. Due to the nature of the data (daily cases and death reports), we followed the advice in [2] to manage the modeling problem as a discretetype one. Also, to overcome homogeneous population assumption and include the social behavior in infection transmission, we included contagion probabilities, implemented a freecirculation and quarantine, and a diffusion system during the viral infection.

Free circulation, quarantine, and detection as components of contagion.
We treat quarantine (q) and free circulation (f) as a flow that affects the total population: the individuals in free circulation could enter quarantine and vice versa. We define q (quarantined) as the sub-index to indicate isolation and f (free) as the sub-index to indicate free circulation. Now, S q will denote the isolated susceptible populations, S f will denote those that remain in free circulation, and so on for other compartments as I. We refer to the flow between f and q as f Ð q. Isolation is relevant for the susceptible and infected since the lack of interactions protects them against contagion and isolated carriers can not spread the disease. Every compartment that belongs to the epidemic flow (e.g., SIR, SEIR, SIS) also allows for a partition into individuals of q and f kind (see Fig 1).
We define the flow state compartments as probabilities: λ qf is the probability of change from state q to f for the next time step, and λ fq the probability of the other flow direction. The population staying in q (e.g. S q or I q ) at time t + 1 only depends on λ qf , since infectious interactions are possible only for individuals in f (e.g. S f and I f ). Finally, we propose an extra flow, the infected detection j, which is the process of carrier identification (e.g. I j ), which can also be treated as a variation of the q (f Ð q|j flow). Thus, j behaves as an absorbant state as long as the individual recovers (R).
For free circulating populations, it is essential to formulate a function that describes the probability that a susceptible contracts the disease. Thus, we implemented novel contagion functions proposed in [28], as the reader can consult in section 3.1.1.

Diffusion systems in transmission models.
Some authors consider that the time elapsed from the moment an individual enters a compartment is a determinant for dynamics such as infectious capacity (pathogen load) or recovery from disease [29]. Some mathematical expressions that include such information during modeling are integrodifferential terms or large chains of coupled compartments with different output rates (Erlang models). However, both approaches are equivalent in certain epidemiological circumstances, as pointed out in [30]. Thus, we propose a discrete analog to the Erlang model approach, considering the f Ð q|j flow.
In Fig 1, we show how a SEIRD model turns into a directed graph where each node has an associated population at every time step. Individuals in node can move to another node on the next time-step, provided an edge connecting both nodes. Such displacements obey probabilities, but we treat them as proportions, the discrete equivalent of rates. We can model some of the flows between compartments as random walks, i.e., through a matrix approach. We called the set of compartments that admit random walk as a diffusion system. Thus, it is possible to represent the entire state as a vector where each position corresponds to a node. To estimate these proportions we introduced to the model a parameter related to scaling (γ) and two parameters related to the shape of a beta distribution function (a, b), as described in detail in section 3.1.2.

Mathematical and biological considerations during modeling.
To formulate the model, we considered the social and biological behavior of the spread of COVID-19 over a

PLOS ONE
population. However, we limited the scope by stating some assumptions which we summarize in the 15 items in Box 1.

Parameter estimation, validation and methods for parameters variation
We took COVID-19 data for several localities in Colombia and periodically updated them by fitting the proposed model. Also, for each locality, we perform parameter estimation and validation process. Then, we implemented it on a web platform to simulate different control scenarios. In this paper, we present the fitting results for six localities: (i) the whole country, Colombia, with 50 11. Recovered individuals do not lose immunity in short periods.
12. The population under study has no interactions with external populations.
13. The population is not homogeneous, instead, it could be represented as clusters of densely connected individuals.
14. There are only two infectious sources: living carriers and the environmental reservoir. All the infection probabilities are independent with each other.
15. Population renewal is negligible for short periods.

PLOS ONE
Thus, we fit the model for an ongoing pandemic considering the parameter variation over time, since some social parameters may vary between localities and periods. We divide the time series of the real data into some fragments to deal with abrupt changes (lock-downs, celebrations, among others), calling each fragment an extension. All the parameter estimations of the model for each locality were achieved through multiple extensions. This process depends on the socio-economic behavior in each locality. We illustrated this methodology in section 3.3.1 and S1 Appendix. For each extension, we perform parameter estimations and validation processes using the final values of the previous extension as initial values of the subsequent extension. We performed parameter estimation by fitting the model to experimental data several times, using the following daily time series: the number of (i) current active cases (not accumulated), J, (ii) accumulated recovered cases, R j , and (iii) accumulated deaths, D. As the optimization algorithm we chose the inner point approach implemented in GSUA_CSB Toolbox, available in [31], under the fmincon function in Matlab2021a. The reader can consult linear restrictions for the optimization process coming from parameter intervals in Table 1. We used a cost function to measure the goodness of fit based on mean squared error and the Pearson correlation coefficient, also implemented in GSUA_CSB [31].
For model validation, we implemented different methodologies such as sensitivity (SA), uncertainty (UA), and practical identifiability (IA) analyses, as described in [32]. We described the algorithm proposed in S1 Appendix for parameter estimation and validation for each extension. All the implemented routines and methodologies, are available in a GitHub repository 1 [33].

Results
We divide this section into three major results: first, we describe mathematical preliminaries of contagion probabilities, diffusion systems, their corresponding compartments, and how they merge into a compartmental model. Then, we present the model fitting for different affected localities in Colombia. Finally, we describe the implementation of the model on a web platform to evaluate the effects of public policies to control the diseases in some localities of Colombia.

Preliminaries: Mathematical structure for COVID-19 model
The proposed model involves: (i) multiple quarantine and free circulation flow, (ii) contagion probabilities, and (iii) diffusion systems. For more information and details, we describe the complete mathematical expressions and their corresponding dynamics in section 3.2. First, we will describe the left-right compartment flow of the model as shown in the orange rectangles in Fig 1. The first flow goes from S to E, and the compartment E is divided into two levels to include a delay in the exposed state. In this way, the latency period is at least two days, and only individuals in E 2 can become infectious at time t + 1 (where the super-index represents the infection day). Then, following assumptions 3, 5, and 6, we split I into three different states: (i) presymptomatic P, constituted by those individuals who have no symptoms but can transmit the disease; (ii) low-symptomatic L, constituted by those individuals who developed mild symptoms after the incubation period had passed; and (iii) severe or high symptomatic H constituted by those individuals who developed severe symptoms after finishing the incubation period.
We include the possibility of becoming low or high symptomatic during the presymptomatic step. Thus, we split the flow from E 2 to P, into P H and P L ; where the super-script in the P case suggests the severity of the symptoms that individuals will develop (low and high). Then, we define flows from P L to L and P H to H, respectively. Finally, when the population becomes H or L, they enter the diffusion systems described in section 2.2.2. Even so, note that only H case has a probability of death and both, L and H, have a probability of recovery.

PLOS ONE
We can define different sets of populations inside the model: the total alive population N = S + E + I + R, with the total population of susceptible S = S f + S q ; total exposed individuals Total infected population I = P + L + H, and the total presymptomatics Finally, we define H and L as all the individuals belonging to those diffusion systems described in section 3.1.2.
3.1.1 Contagion probabilities overcoming homogeneous assumption. From assumptions 1, 3, and 4, it follows that infectious interactions that involve P f are less likely to result in new infections than the interactions involving L f . We define the probability of infection after infectious interaction with presymptomatic, β P , and after interaction with symptomatic, β L , treating each type of interaction as a different process. Considering assumption 2, a natural expression for the number of interactions that the infected individuals has with the susceptible population is zI f S f /N, being z the number of average interactions that a regular individual has by time-unit. In this way, the presymptomatic contact function would be given by where the expression @(�) represents the heterogeneity of the population (@(�) � 0 is a homogeneous population), and it is introduced to decrease the probability of contact whenever I f increases or S f decreases. ν is the parameter of intrinsic isolation. We define the symptomatic contact function as  Finally, to define the probability of infection from an environmental source, let T(t) represents the size of the reservoir at time t and set k L as the mean contribution to the reservoir of a typical symptomatic carrier L(t) per unit of time. It would be impossible to avoid contact with the reservoir if all the individuals become infected. Then we assume that a reservoir size of k L N(t), where N(t) is the size of the whole population, represents an extreme scenario, i.e., that T(t) � k L N(t). Now, assuming that k L N(t) � k L N(0), we consider that the probability of having contact with the reservoir F T must be equal to the numerical closeness between T(t) and k L N(0). So, we defined the contact function with the environmental source as To achieve the related contagion functions, it only remains to multiply each contact function by its respective probability of infection (β P , β L , β T ). From assumption 14, the probability of not being infected would be the multiplication of the complements of infection probabilities given by the contagion functions.

Diffusion systems in transmission models with quarantine compartments.
According to assumptions 7, 8, and 9, we divided the classical compartment L and H into time-since-event compartments, because we can represent them as flows, as in [15,34]. First, we assumed to have a sampling time unit of one day and set a maximum number of days-since-event to model m within the selected compartment. Then we represented by sub-compartments (node with level) L τ and H τ where t 2 f0; :::; m À 1g, the population of individuals that remain infected after τ-days since they became infected. We refer to each sub-compartment as a node, each compartment as a state, where the super-indices (τ) as the level of the node (related to disease time development). On the other hand, sub-indices will indicate the type of the node (f, q, or j).
The advantage of introducing multiple levels for symptomatic states, resides in the fact that we can define the probability of recovery from disease, the probability of dying because of the disease, and the capacity of disease transmission, as functions of the number of days since symptoms development (similar to the case of integrodifferential terms for continuous-time case).
However, in this paper, we focus on recovery and death functions. To estimate those functions we define three parameters: a and b as the shape parameters of a beta-distribution function, and γ 2 [0, 1] as a parameter of scale. Then, we split the domain of the beta-distribution function into m equally spaced points to achieve a discrete function and normalize the values of such discrete functions to fit in the interval [0, γ], being 0 the lowest value of the discrete function and γ the highest value. Such beta-shaped function evaluated at each point will give the probability of recovery or death for the corresponding level (see Fig 2).
We found in the literature that the virus could remain for 41 days or less for some reported cases [35]. Hence, we set m ¼ 41 and define each recovery function (for L and H) and death function (for H) as explained previously, i.e, through a parameterized and discrete beta-shaped function. In this way, we get three types and 41 levels for state L, for a total of 123 nodes, and a single type with 41 levels for H (41 nodes). Once the nodes have been established and numerated, it only remains to define random-walk matrices to easily model the flow inside L and H states. We enumerated the nodes as can be seen in Fig 3 and chose the following priority of events: recovery > dead > f Ð q|j. Priority means that an individual who recovers cannot die,

Fig 2. Example of two cases for diffusion systems.
We propose discretized beta-shape functions to assign a probability for recovery or death in each level. Assigning an enumeration for the nodes and a priority for flows, it is possible to define correspondent random walk matrices. https://doi.org/10.1371/journal.pone.0275546.g002

PLOS ONE
and so on. Thus, we defined a matrix for each flow and multiply them to achieve the random walk matrix. For instance, the matrix W L 1 in Fig 3 represents the f Ð q|j transversal flow; the matrix W L 2 represents the process of recovery from disease for L; W H 1 represents the process of dying from disease for H; and W H 2 the process of recovery from the disease for H. Finally, the random walk matrix for L is given by and the random walk matrix for H is given by which is strictly triangular superior. In sections 3.2.4 and 3.2.5, we define mathematically the diffusion systems, which include a beta-distribution function to describe the probabilities to flow from one compartment to another one.

The mathematical models: COVID-19 case
In this subsection, we describe the mathematical model presented in Fig 4, together with their mathematical expressions for each compartment and diffusion system. In Table 1, we describe the states and parameters implemented in the model with their feasible ranges and units.

Susceptible state (S).
We divide the susceptible population into two sub-compartments: those in free circulation, S f , and those in voluntary quarantine, S q . Event probabilities and their complements define the flow between compartments. For example, the number S f is given by the probability of not becoming infected from any infectious resource , and do not entered into quarantine (1 − λ fq ) plus those who left quarantine (λ qf ). Similarly, S q depends on the probability of entering quarantine due to not becoming infected and leaving quarantine. But, if there is an infectious interaction in S f , they become exposed and pass to the E compartment. The successive compartments follow the same idea of probabilities and their complement, even with more events, as we will see below.
3.2.2 Exposed state (E). Exposed individuals could circulate freely (E f ) or stay in quarantine (E q ) with probabilities λ fq and λ qf ; they have not developed enough viral load to be

PLOS ONE
infectious. Some individuals can stay over two days in this subsystem because the flow from E 2 to P is given by e À � EP . Those individuals who have already completed the latency period become P L or P H .

Low and high presymptomatic states (P)
. E passes to P depending on the severity of the symptoms they develop. Note that individuals in presymptomatic compartments have enough viral load to infect a susceptible individual through direct contact, and they also contribute to the viral load in the environmental reservoir. The flow from P to L and H are defined by e À � PL and e À � PH , respectively which represent probabilities of transitions between the compartments.
• Low presymptomatic (P L ): we defined them as individuals that would develop mild symptoms or no symptoms. We assumed that the disease effect of this population does not cause death. Also, this compartment has sub-compartments that represent free-circulation, P L f with probability λ qf , quarantine or self-isolation, P L q with probability λ fq , and identified, P L j . The last one is a new sub-compartment, which causes obligatory isolation because the identification by the health system. We described it with probabilities ϑ E (for those new presymptomatic) and ϑ P (for those circulating presymptomatic). Finally, δ is the proportion of individuals that will develop severe symptoms and 1 − δ is its complement.
• High presymptomatic (P H ): As in the P L compartment, we divided this compartment into three sub-compartments as (i) free-circulation, P H f , (ii) quarantine, P H q , and (iii) identified, P H j ; with similar probabilities as exposed in the P L compartment case.

Low symptomatic state (L).
Individuals in this compartment have sufficient viral load to transmit the disease and manifest mild symptoms or be asymptomatic. They will not die but will recover with some time-since-infection probability. Individuals in L can move between the free circulation and quarantine states or compartments. Also, they could be identified and isolated in L j . Low symptomatic carriers have a greater viral load than presymptomatic ones; thus, they contribute more to the environmental concentration of the virus.
We took into account some parameters related to COVID-19 control. First, we modified flows from f to q (λ fq ) and from q to f (λ qf ) for L, by defining a parameter η L and implemented it as l Z L fq and l 1=Z L qf . Note that the higher the value for η L the more similar the behavior of L and P. In contrast, when η L ! 0, the entire population in L f + q ≔ L f + L q moves and remains in L q unless they are detected. Also, we assumed the identification of presymptomatic individuals to be lesser than the identification of symptomatic ones. Hence, we proposed that W Z W P percent (where η ϑ 2 [0, 1]) of the population in L f+q is detected and moves into L j for the next time step (whether they do not recover). Note that all the people in P that do develop severe symptoms are immediately detected as they flow into H j . Further, from assumption 7, the L has f, q, and j compartments, and the only compartment available for H is j.
As mentioned in section 3.1.2, we consider three types and 41 levels for the L state for a total of 123 nodes (or sub-states). Instead of writing down those 123 state-equations we defined a vector with 123 positions (one for each node) and a random walk matrix. However, it is quite difficult and unintuitive to present our results based on the position of that vector. So, we proceed as follow: let ℓ 1 (�) be a function that receives a node and returns its level, and ℓ 2 (�) a function that receives a node and returns its type. Also, let N s denote the starting node for the flow, N a the arrival node for that flow, and let Q the condition ℓ 1 (N a ) = ℓ 1 (N s ) + 1, for all N a and N s in the diffusion system. Flows within the diffusion system can be modeled through the function

PLOS ONE
while inside/out flows of L can be modeled by being the W L 2 ðN s Þ the discrete beta-shaped function we mentioned in section 3.1.2 and g γ (x) a beta-distribution function parameterized by a L and b L .
It is worth to mention that the incoming flow of a node becomes the total content of the node for the next time step. That is because all individuals in time t change their level for time t + 1. For instance individuals in level k � 1 and type f for time-step t + 1 would be given by where the sum does not consider any other levels than k − 1 because of the condition Q in function W L 1 . The expression L kÀ 1 w ðtÞW L 1 ðL kÀ 1 w ; L k f Þ in the multiplication above gives us the proportion of individuals in the starting node that will move into the arrival node for the next time-step following the f Ð q=j transversal flow. Multiplication by ð1 À W L 2 ðL kÀ 1 w ÞÞ allows us to only consider those individuals that do not recover. On the other hand, it is clear that It only remains to note that the first level for L would run empty for the next time-step unless we consider the income of new low-symptomatic individuals. Thus L 1 (t + 1) would be given by Note that we include three main strategies for the identification of the carriers of COVID-19: (i) tracing of contagion networks, i.e., the isolation of individuals who had suspected contact with an identified carrier; (ii) the identification of those presymptomatic carriers who eluded the first strategy. Finally, (iii) the isolation of those carriers that do exhibit symptoms. We model such strategies by including three parameters for the identification of infectious individuals. To comply with such assumptions, we proposed a parameter ϑ E that represents people in E moving into P j because the public system traces the contagion networks. Now, ϑ P represents people in P f+q = P f + P q , that do not develop symptoms but move into P j , which models the second strategy. For the third strategy, the identification of symptomatic carriers to be greater than the identification of presymptomatic ones, we proposed that W Z W P (where η ϑ 2 [0, 1]) of the population in L f+q (analog to P f+q ) move into L j .

High symptomatic state (H).
Individuals in H have enough viral load to transmit the disease and they develop severe symptoms; thus, the health system rapidly identifies and isolates them, so there is a single compartment H j in which these individuals stay in mandatory isolation. We assume these individuals can die or recover with time-since-infection probability. It follows that the diffusion system for H does not have flows of the form f Ð q/j, and therefore random walk matrices will be made up of probabilities of death and recovery. Functions that consider the characteristics of H, analogous to those given in the subsection 2.2.4, are defined as follows: where the functions g μ (x) and g H (x) are parameterized by beta-distribution functions defined as Function W H 3 ðN s Þ will be used in the following sections to define inside/out flows. In the same way, it must be clear that the income of new high-symptomatic individuals is given by while the total of high-symptomatic individuals is given by Finally, as an example, the individuals in level k � 1 for time-step t + 1 would be given by which denotes the proportion of individuals in H k−1 (t) that do not die and do not recover.

Recovery and death states.
The disease dynamics finish with recovered individuals from both L and H, and dead individuals from H. These are two separate compartments, and are similar in structure because they acts as sinks.

Environmental reservoir.
Assume we have an environmental reservoir of the pathogen constantly renewed by emissions from the infected individuals I(t). Susceptible individuals S(t) do have contact with contaminated surfaces at some rate, and then they might get infected. The contact rate is a non-decreasing function of the reservoir size, and without emissions from I(t), the reservoir would run out. That is the basic idea introduced in [27] to formulate a continuous-time model. We proposed a discrete-time equivalent for that idea defining a threshold from an extreme scenario and setting the behavior of the equation according to that threshold.
In this way, we defined T(t) as a compartment for the virus concentration in the environment according to the contributions of infectious micro-droplets expelled by P and L individuals in free circulation. The environmental reservoir can remove the virus from the system by a flow e À � T . The susceptible individuals could become infected during direct interaction with this reservoir, as exposed in section 3.1.1. We set T(t) � k L N(t) because our threshold scenario is the one where all the population becomes infected and symptomatic. Hence, a natural expression for this compartment is given by where k P is the contribution to the reservoir from P and k L is the contribution of symptomatic carriers to the reservoir. Further, the relation between k P and k L turns into how much symptomatic individuals contribute to the environmental reservoir more than presymptomatic. Then, we assumed that the increment is related to the viral load within the carrier. The viral load for a member of L could be from 1 to 10 times greater than that of a member of P according to [40].

Additional equations for model fitting.
When we perform parameter estimation, it is ideal to have the greatest amount of information available to make the model fit. For example, reference intervals for the parameters to be estimated and real data to which the model will fit: time series of the active cases, recovered, and dead (equivalent to node D) identified. Note that no explicitly available node is equivalent to said time series in the model; however, it is possible to propose equations that allow the fit without increasing the complexity. For instance, we propose in Eq (4) the following equations for model fitting: J as the number of active detected carriers at time t, and R J as the number of detected carriers that have recovered at an specific time.
3.2.9 Simulation of complex phenomena. We identified migration, vaccination, and immunity loss, as complex phenomena with a high impact on the design of public policies. However, we gave up estimating their impact from available data because of the lack of information we would need to validate those estimations and the complexity shift they would cause in the model structure. Then, for the first two, we decided to include them as parameterized model inputs. We modeled such inputs as time series that indicate the impact of the input over the model at each time step. On the other hand, we treated immunity loss as a pulse that allocates a fraction of the recovered individuals into the susceptible ones at the start of the simulation. We based this last decision in the concept of extension, which we mentioned in detail in section 2.3. Next, we briefly introduce the parameterizations we proposed to simulate those phenomena.

Migration:
We considered the migratory process as a flow of individuals towards the system. We assumed that symptomatic individuals refrain from migrating, which is a consequence of socially correct behavior. Then, by definition, the migratory flow must consist of susceptible, exposed, and recovered individuals in free circulation. To parameterize time-series generation for migration, we defined seven parameters distributed as follows: two parameters for shape (a m , b m ), one parameter for scale (γ m ), two parameters for proportion of infectious and recovered migrants ðg m I ; g m R Þ, and two parameters for initial and final time of migration ðt m 0 ; t m f Þ.
By using the shape parameters we compute a normal distribution function f(x) with domain restricted to the interval defined by is then scaled into the interval [0, γ m ] defining a new function using the following transformation The function h(x) allows us to achieve a time series when evaluating it at every time step of the simulation. This time series does have the shape of a discrete normal distribution function and a maximum value given by γ m . Finally, we were able to take into account migration slightly modifying the equations for S q , E 2 , and R as follows. The relation g m I þ g m R � 1 holds.
2. Vaccination: Given the nature of the proposed model, we modeled the vaccination as a flow of people from compartment S to compartment R. In this way, we were able to treat vaccination as a time series that represents the number of susceptible individuals becoming immune to the disease at each time step, i.e., they acquire the status of recovered. To parameterize this time series, we followed a similar strategy that the one we designed for the recovery/mortality probabilities in diffusion systems (see section 3.1.2). To model vaccination we defined seven parameters as follows: two shape parameters for a beta distribution (a v and b v ), one parameter for vaccine effectiveness (γ v ), one parameter for strategy of vaccination (δ v ), two parameters for the initial (t v 0 ) and the final time (t v f ) of vaccination program, and a last parameter for the total amount of vaccines (v). The total number of effective vaccines would be given by vγ v . We used time parameters to set the domain of the beta distribution function. Finally, we chose a parameter of scale k v such that and N j (t) the total amount of detected individuals at time t (including the recovered ones that were previously detected). In this way, the vaccination time series would be given by the function

PLOS ONE
that we included in the system modifying the equations for S f , S q , and R in the following way The meaning of the strategy of vaccination parameter δ v is the proportion of vaccines to be dosed that requires a displacement of the individuals to some place, which makes them behave as individuals in free circulation, regarding the vaccines to be dosed without exposure of the individuals to infectious sources. Note that proportion S(t)(N(t) − N j (t)) −1 models the uncertainty involved when applying vaccines without knowing if the receptor is a real susceptible individual. This expression might be removed according to the particular context of each locality.
3. Immunity loss: Reinfection or cross-immunity, is a phenomenon we expected to happen because of the biological properties of the disease alongside its pandemic status. We aimed to model immunity loss by modifying the recovered compartment to flow again into the susceptible one. Hence, we proposed a function Υ(t) that allocates a small proportion of R into S (given by immunity-loss parameter k I ) at the beginning of the simulation for each extension (see section S1 Appendix for model extensions) and we modified equations for recovered and susceptible individuals to include Υ(t) as follows Rðt þ 1Þ ¼ � � � À ΥðtÞRðtÞ: One could expect parameter k I to stay close to 0 for each extension. Therefore, we recommend keeping it as 0 until most of the population recovers ( RðtÞ NðtÞ � 0:75) or the estimation algorithm starts giving poorly feasible parameters (suggesting a high effect of immunity-loss in the dynamics of the disease). We included λ qf /(λ qf + λ fq ) and λ fq /(λ qf + λ fq ) in equations above because we assumed that there is an implicit f Ð q flow within the recovered compartment. Being R absorbant, it follows that those terms represent the proportion of individuals in the f and q sub-states, respectively.

Model implementation for Colombia case
As mentioned in section 2.3, we initially validated the model through UA, SA, and practical IA methodologies described by [32]. During the validation process, we focused on two localities with their initial outbreaks: Colombia and Hubei-China; further information about the description of the results is available at GitHub repository 1. We identified some relevant differences in parameters for social behavior in both localities that suggest Hubei presented a proficient control over the population to avoid disease spread. Also, we highlight that identification of parameters related with tracking of carriers (η ϑ , ϑ P , and ϑ E ), quarantine and connectivity parameters (λ fq , λ qf , and ν) have high importance inside the variance output.
After the corresponding validation (see GitHub repository 1), we implemented the model for different localities in Colombia, where we estimated parameters for all departments of Colombia and their capitals, alongside other specific localities with touristic and economic importance. Thus, we constantly updated 71 localities in Colombia using the algorithm proposed in S1 Appendix from March 2020 until February 2022. In Fig 5, we present six examples of model fitting for some affected localities using three outputs: active cases, dead, and recovered. For example, (A) the entire country and (B) the capital in which we can identify four demarcated outbreaks in actual cases. Also, different localities are big cities, economically important cities, or border cities: (C) Medellín, (D) Barranquilla, (E) Leticia, and (F) Cali. Note that the localities of Bogotá, Medellín, Cali, and Barranquilla hold the whole country trend. Barranquilla has a lower second outbreak compared with the mentioned localities. In contrast, Leticia had one marked peak, followed by small oscillations.
Each curve fit is composed of different pieces of estimations that we defined as extensions in section 2.3 and estimated them using the algorithm described in S1 Appendix (see Fig 6). For the time series of Colombia, Medellín, Bogotá, Leticia, Cali, and Barranquilla, we divided them into 19,18,16,15,21, and 27 extensions. For the value and result of each extension, we present validation information available in .pdf for each locality in a GitHub repository 2.
3.3.1 Web platform: An evaluation of public policies. We step forward from the design and fitting of the mathematical model to its implementation in a web platform called Math-COVID. It is a platform open to the public that was used as a reference by the epidemiological units in Colombia for all the locations mentioned in this study. Authorities associated with the control implemented the model in the platform to evaluate the effects of public policies on epidemiological indicators (see S2 Appendix). The link to the platform is available in GitHub repository 2.
In the platform, we implemented the parameters that inherently describe control policies as sliders to allow users to evaluate different strategies to compare with the model forecast given by the current parameters. The changes in the parameters are presented in Fig. S4 Appendix as follows: • Processes of removing the virus from the environment (ϕ T ), as cleaning surfaces or closed areas.
• The lockdown process by increasing or decreasing the probabilities of going outside or inside quarantine (λ qf and λ fq ). We could define this process as an intermittent strategy (three days of restricted circulation).
• Connectivity level inside populations (ν), i.e., can simulate the restriction of the availability of public transport between neighborhoods or localities.
• The number of average interactions between individuals (z) as those established in public transport, supermarkets, schools, or at job places.
• The government capacity to track contagion networks (ϑ E ), identify asymptomatic and symptomatic carriers (ϑ P and η ϑ ), and finally, the likelihood that people with symptoms will self-isolate (self-care, η L ).
• Include migration as an input given by a parameterized time-series that models the addition of new individuals to different compartments, as described in section 2.

PLOS ONE
• Include vaccination programs as an input given by a parameterized time-series that models the number of vaccines applied to individuals per unit of time. Vaccinated individuals that gained recovered status according to the effectiveness of the vaccine. We refer the reader to section 2 for further reading.
Modeling COVID-19 in this way has allowed us to identify possible bad scenarios and simulate outbreaks of COVID-19 in Colombia and some of its localities in the last two years. We present the simulation and the long-term behavior of the system according to model fitting in Figs 7 and 8 for multiple outbreaks in the country. For example: Scenario (A) is reached by increasing the parameter λ qf value (from 0.17 to 0.33) to simulate the social dynamics of the Christmas celebration in Colombia in which the people tend to free circulate triggering the fourth outbreak produced by the appearance of the Omicron strain in Latin America [41,42]. Note that the curve obtained during this simulation meets the trend of real data in Fig 5, a dynamic behavior also validated in S4 Appendix with Montecarlo simulations. In the following figures, we simulate scenarios with a high flux of people alongside some public policies that could decrease the number of cases of the fourth outbreak. In scenario (B), we present the intermittent lock-down strategy described as pulseseries, in which λ fq and λ qf are as in S3 Appendix, i.e., people tend to go out every five days and then lockdown.
Scenario (C) in Fig 7, show the simulation of a high flow of people because of Christmas (λ fq = 0.33) alongside an increase in the identification programs through parameters ϑ E (from 0.11 to 0.15) and ϑ P (from 0.55 to 0.60) that represents an increasing of identification programs effort. Following this strategy, we can obtain a smaller outbreak than the one obtained in scenario (A) and even in the real data presented in Fig 5. On the other hand, in scenario (D), we increased the value of ν (from 35 to 150), reducing the movement inside the country, representing a limitation of people from moving to other localities. It has greater impact among the presented scenarios. In scenario (E), we present the case of a considerable migration for a month, in which susceptible people gradually incomes to the country. We can see the effect in the long term, in which detected cases increase considerably after the migration.
We present the vaccination effect in scenario (F) (see Fig 8). We simulated a historical situation of Colombia before implementing mass vaccination (until March 2021) and after it (simulation until July 2021); note that the trend of the simulation with vaccination is similar to the real process carried out in the country during July and August 2021 (see Fig 5).

Discussion
Since the emergence of the COVID-19 pandemic, some researchers have approached the disease transmission and control study using mathematical models of the SIR − SEIR-type [3,4]. However, simplified models exhibit theoretical gaps related to transmission dynamics, e.g., latency or incubation periods, lack of quarantine as a transversal flow, social distancing, and changes in detection strategies [3,4,43,44]; especially for designing and evaluating public policies. Therefore, we developed a model structure and a calibration methodology that remedy these gaps by adapting mathematical expressions for social and epidemiological dynamics. For example, an f Ð q|j flow that represents the quarantine and isolation behavior, contagion probabilities that overcome homogeneous dynamics, and diffusion systems for recovery and death dynamics (see section 3.2). All those components together allowed us to model COVID-19 and the effect of public policies.
Our major result is the formulation and validation of a mathematical model developed to simulate public policies using, as a case study, the COVID-19 data. We validated the model alongside parameters and confidence intervals according to different criteria (UA, SA, and practical IA available in GitHub repository 1). The importance of validating the model using the UA, SA and practical IA consist in determining that all proposed parameters are useful and representative of the disease transmission and control in the model. Thus, we implemented the model in several localities with different socio-cultural behavior that share the same data type (infected, recovered, and death cases). For example, Colombia, Bogotá, Medellín, Cali, Barranquilla, Leticia (see section 3.3) and so on presented in GitHub repository 2.
The COVID-19 dynamics in Colombia was similar for Bogotá, Medellín, and Barranquilla; those correspond to large cities with high connectivity between other localities in the country (see Fig 5). For these localities, the first outbreak started after relaxing the quarantine policies. The following outbreaks were related to the increase in contact numbers during Christmas vacations and returning to high-density places (schools, universities, public transport, and jobs). In Cali, there is a similar trend, in which there are multiple outbreaks, but they are lower than the outbreak mostly caused by the Omicron strain, the increase of contacts and the leaving quarantine because of Christmas. For Leticia, the case was completely different. This locality has an explosion of COVID-19 cases at the beginning of the pandemic, then after which there were no new outbreaks, and becoming endemic.
We highlight that the proposed model fits the epidemic dynamics in each study zone; these results suggest progress in fitting transmission models based on real data. For example, as [13] mentioned, some Erlang structures had an issue with identifying and estimating the distribution function related to the diffusion systems, which [45] suggested for future work in the area. We overcame this issue through the estimation method proposed for the complete model, related to the updating process in section 3.3 and the S1 Appendix. We illustrated this process with the study cases of localities in Colombia and China. The calibration of mechanistic models must be constantly updated and recent to capture the disease dynamics and design public policies, specially studying diseases with fast spread as COVID-19 [7].
In section 3.3, we mentioned different policies and their combinations to evaluate and forecast the effect of public policies on an affected population. Through the validation process, we established that the quarantine-related parameters (λ fq and λ qf ), carriers identification parameters (ϑ P , ϑ E , and η ϑ ), and connectivity inside the population (ν) represent the major variance in the output response of the model. Multiple authors discussed this result, concluding that quarantine, regulation in public transport, social distancing, and the use of a mask are valuable strategies to control an epidemic [12,46,47]. These policies allowed us to get information about the control conditions to flatten the curve of cases or deaths [8].
Society learned that the zero-COVID policies could suffocate social and economic activities [8]. Thus, it is crucial having interdisciplinary work and expert review during policies design and evaluation using the pros and cons, because people must agree with the policies to shape COVID-19 cases and deaths [8]. We implemented the proposed model on a platform web with more localities to develop comparative studies according to the researcher's interests. We extended the analysis performed for Colombia to other localities through an interactive tool that decision-makers around Colombia could implement since July 2020. This platform worked as a baseline for designing and analyzing public policies during the most crucial moments in the COVID-19 pandemic in Colombia in 2020-2022. We focused on developing a friendly user interface in which a non-expert mathematician could vary the control parameters in the model and understand how the variations of those parameters could affect the COVID-19 dynamics.
Using Colombia's data as study case, we presented control scenarios using the information before the Omicron outbreak (see Fig 7); we could identify the different social behaviors and the results in reducing the mentioned outbreak mainly caused by Christmas celebrations. In the first scenario (A), we changed the probability of getting out of quarantine to simulate an out-coming flux due to Christmas celebrations. The scenarios (B) to (D) were created as alternative policies to reduce the number of cases, for example, by implementing intermittent lockdowns, increasing identification programs, or reducing the homogeneous dynamics of the population. The migrant simulation in scenario (E) shows the dynamics of Colombia after receiving a high quantity of migrants into the susceptible population and the final simulation in scenario (F) shows the vaccination process.
Note that we introduced the vaccination and migration as processes in the model as an input instead of a parameter because, for Latin American countries, the vaccine availability was not clear by February 2021. Therefore, we considered it convenient to solve this problem by introducing this process as a time-series input; because it allowed us to evaluate the effect of vaccination in reducing the incidence of COVID-19 in Colombia, as can be seen in Fig 8. Also, we could generate other scenarios that combine vaccination with other public policies associated with quarantine, and the use of masks, among others. Additionally, large-scale vaccination is a process that requires multiple parameters to include and estimate. Hence, developing vaccination as an input reduces the complexity produced by the number of parameters to be estimated, which can generate a large uncertainty in the model output.
Without using a well-known time series to define vaccination input during the estimation process, we could identify some parameters that mitigate the absence of a direct incidence of vaccination. This can be identified in populations with extensions before and after the vaccination started in Colombia. For example, the parameters ϑ E , z, a L , and b L presented changes can be seen in GitHub repository 2 for estimations in Colombia before and after March 2021. Even so, note that the model could capture the dynamics of the disease in Colombia and its localities.
Simulating single or multiple control scenarios offers a useful tool for researchers that involves decision-makers. Thus, we highlight that the proposed model can be adapted according to the natural history of other diseases and countries, i.e., we can use it to model other SEIR-type diseases because it is adequate for transmission not associated with COVID-19. For instance, it is possible to create a model without an environmental reservoir and adapt the diffusion systems to describe diseases like influenza or pneumonia. Also, we can add compartments related to vector populations and model the transmission of vector-borne diseases. Thus, it is possible to estimate the under-reported cases through parameter estimation for any target diseases and populations provided there is enough data to fit the model [48].
In the mathematical modeling process, some characteristics prevail: the meaning associated with the parameters and the model fit achieved to the phenomenon [49]. Many mathematical models try to describe, with a degree of generalization or simplification, a real phenomenon in such a way that they can present fitting problems. For example, a classical SIRD model can not adjust to the dead data through a multiobjective optimization problem of fitting positive, dead, and recovered cases of China [50]. Therefore, the complexity of describing the real process must be evaluated, as well as other modeler objectives as the development of control measures, such as public policies associated with epidemiological events. In our case, we considered incorporating different public policies and the information available for this disease in Colombia (identified cases, recovered cases, and deaths). We highlight that because of the availability and quality of the data registered for COVID-19 in Colombia, it was possible to develop a detailed model, presented in section 2.2.
The complexity of the proposed model must be linked to the data availability and the main interests of the researcher. For example, we focused on the behavior of the general population to develop and validate a method for modeling public policies. Even so, because of the agerelated severity in COVID-19 [51,52], the mathematical structure proposed in this paper could be implemented to model meta-population as age-divided groups by defining and linking multiple models for the age-structures [53]. Also, if data by age groups is available, parameter estimation for the age groups could be carried out as future work in the area. We encourage the development of a formal study of the computational time required for the model calibration and validation. Even so, the time taken for model fitting satisfied the necessities required by the decision-makers in Colombia. Finally, note that in this paper, we focused more on identifying the effect of public policies than developing and studying R 0 and R t measures, considering the difficulties exposed in [54].

Conclusion
We have shown the potential to model SEIR-type dynamics using a structure that combines contact probabilities for heterogeneous populations, a quarantine, and free circulation flows from Erlang model theory, and diffusion systems that describe death and recovery dynamics. This potential was evidenced by fitting the proposed discrete model in more than 70 localities in Colombia for the first two years of the COVID-19 pandemic (see GitHub repository 1 and GitHub repository 2).
Also, we gave a step further by implementing the model into a web platform as a tool for helping the decision-makers to evaluate the effects or to design public policies such as lockdowns, social distancing, identification of contagion networks, migration, vaccination, and connectivity among populations. Local governments and institutions, such as the Universidad EAFIT, implemented this platform to identify when and how often to reintegrate into their professional activities and reduce its effects on the active cases. Also, some economic activities in the country, such as the Coffee Growing Area, implemented the model in 2020 to identify the epidemic dynamics and evaluate the control strategies effects. Finally, we observe that future researchers could implement the proposed method and model structure for other countries and diseases.